2020-02-24

Learning Objectives

  1. Plot vector geospatial data (points/polygons)
  2. Plot raster geospatial data (image)
  3. Extracting data from raster
  4. Making maps (base maps, compass rose, overlays, etc)

Motivation

We are researchers who want to answer the question “What range of environmental conditions do zebra tend to inhabit?”

General workflow:

  1. Plot locations of zebra
  2. Plot map of environmental conditions
  3. Plot contextual information
  4. Extract data about environmental conditions at those locations

Variants of this question/workflow are applicable to other study systems.

Motivation

To answer our research question, we will need to:

  • Read and write spatial data
  • Represent geographic and attribute data
  • Transform between different models of Earth
  • Do geographic operations
  • Make maps EN: Do we need this content or does it arise pretty natually from our questions and then get explained by our demos?

Spatial analysis in R

Representation of spatial data

  • Vector data model uses points, lines, and polygons (e.g. locations of animals, ecosystem types) and can include nonspatial attribute data
    • How many counts at a point?
    • Length of a transect
    • Land-cover type of a polygon
  • Raster data model divides surface into cells of constant size (e.g. gridded temperature)

R packages for spatial analysis

Part 1: Plotting vector data

Working with sf objects

  • sf (simple feature) objects are extended data.frames or tibbles
    • attribute data stored as tibble
    • geometry stored as a list column
    • can have multiple types of geometry in one object
  • class is sfc (simple feature columns) with useful components like
    • bounding box bbox
    • CRS (coordinate reference system) attributes

sf and tidyverse

  • sf functions begin with st_
  • Methods for summary, plot
  • sf methods for filter, arrange, distinct, group_by, ungroup, mutate

Look at all the methods…

methods(class = "sf")
##   [1] $<-                        [                         
##   [3] [[<-                       aggregate                 
##   [5] anti_join                  arrange                   
##   [7] as.data.frame              cbind                     
##   [9] coerce                     dbDataType                
##  [11] dbWriteTable               df_spatial                
##  [13] distinct                   extent                    
##  [15] extract                    filter                    
##  [17] full_join                  gather                    
##  [19] group_by                   group_map                 
##  [21] group_split                identify                  
##  [23] initialize                 inner_join                
##  [25] left_join                  mask                      
##  [27] merge                      mutate                    
##  [29] nest                       plot                      
##  [31] print                      raster                    
##  [33] rasterize                  rbind                     
##  [35] rename                     right_join                
##  [37] sample_frac                sample_n                  
##  [39] select                     semi_join                 
##  [41] separate                   show                      
##  [43] slice                      slotsFromS3               
##  [45] spread                     st_agr                    
##  [47] st_agr<-                   st_area                   
##  [49] st_as_sf                   st_bbox                   
##  [51] st_boundary                st_buffer                 
##  [53] st_cast                    st_centroid               
##  [55] st_collection_extract      st_convex_hull            
##  [57] st_coordinates             st_crop                   
##  [59] st_crs                     st_crs<-                  
##  [61] st_difference              st_force_polygon_cw       
##  [63] st_geometry                st_geometry<-             
##  [65] st_interpolate_aw          st_intersection           
##  [67] st_intersects              st_is                     
##  [69] st_is_polygon_cw           st_line_merge             
##  [71] st_linesubstring           st_make_valid             
##  [73] st_minimum_bounding_circle st_nearest_points         
##  [75] st_node                    st_normalize              
##  [77] st_point_on_surface        st_polygonize             
##  [79] st_precision               st_segmentize             
##  [81] st_set_precision           st_simplify               
##  [83] st_snap                    st_snap_to_grid           
##  [85] st_split                   st_subdivide              
##  [87] st_sym_difference          st_transform              
##  [89] st_transform_proj          st_triangulate            
##  [91] st_union                   st_voronoi                
##  [93] st_wrap_dateline           st_write                  
##  [95] st_zm                      summarise                 
##  [97] transmute                  ungroup                   
##  [99] unite                      unnest                    
## see '?methods' for accessing help and source code

Reading shapefiles

roads <- st_read(here("data", "enproads.shp"))
## Reading layer `enproads' from data source `C:\Users\User\Desktop\QERM597-making-maps\data\enproads.shp' using driver `ESRI Shapefile'
## Simple feature collection with 565 features and 3 fields
## geometry type:  LINESTRING
## dimension:      XY
## bbox:           xmin: 14.35337 ymin: -19.48418 xmax: 17.15124 ymax: -18.5008
## epsg (SRID):    NA
## proj4string:    NA
class(roads)
## [1] "sf"         "data.frame"
head(roads)
## Simple feature collection with 6 features and 3 fields
## geometry type:  LINESTRING
## dimension:      XY
## bbox:           xmin: 14.46841 ymin: -19.41752 xmax: 14.58964 ymax: -19.33077
## epsg (SRID):    NA
## proj4string:    NA
##     TYPE LENGTH KM                       geometry
## 1  Track    0.7  1 LINESTRING (14.50502 -19.35...
## 2  Track    3.6  4 LINESTRING (14.49077 -19.36...
## 3 Graded    2.3  2 LINESTRING (14.4864 -19.417...
## 4  Track    7.1  7 LINESTRING (14.58498 -19.34...
## 5  Track    1.2  1 LINESTRING (14.58498 -19.34...
## 6 Graded    0.1  0 LINESTRING (14.52596 -19.33...


# Highlight parts
# 1. list column (aka sfc)
# 2. feature geometry (sfg)
# 3. feature (row)

# Default methods for objects
plot(roads)

plot(filter(roads[,3], KM > 20))

Other data sources

  • Packages such as rnaturalearth
  • Convert from other Spatial* objects using st_as_sf

Data from packages

world <- ne_countries(scale = "medium", returnclass = "sf")
st_geometry(world) %>% plot()

Namibia

Default data is Lat/Long

Namibia <- ne_countries(country = "namibia")
Namibia_sf <- st_as_sf(Namibia)

plot(Namibia_sf)

* Exercise: try for another country

Want more from the natural earth package?

#Uncomment here if using rnatural earth for the first time: 
#devtools::install_github("ropenscilabs/rnaturalearthdata", force = TRUE)
#install.packages("rnaturalearthhires", repos = "http://packages.ropensci.org")

namibia_political <- ne_states(country = "namibia")
namibia_political_sf <- st_as_sf(namibia_political)
plot(namibia_political_sf)

Coordinates

  • st_crs() used to convert coordinates (do projections) and do datum transformations
  • Proj.4 string or ESPG code contains this info
  • ESPG code: public registry of spatial reference systems
  • Web repositories for ESPG and Proj.4 information
  • ESPG codes identify coordinate reference systems (CRS)
    • Ex: CRS= 3857 (web-based mapping i.e. Google, OpenStreetMap)
  • Lat/Long data:
    • Often expressed as WGS84: World Geodic System 1984
    • Ex: CRS= 4326 (coordinate system based on Earth’s center of mass)
  • ESPG = European Petroleum Survey Group

st_crs("+proj=longlat +datum=WGS84") # Proj.4 string"
## Coordinate Reference System:
##   EPSG: 4326 
##   proj4string: "+proj=longlat +datum=WGS84 +no_defs"
st_crs(3857)                         # ESPG = 3857
## Coordinate Reference System:
##   EPSG: 3857 
##   proj4string: "+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +wktext +no_defs"
st_crs(3857)$units                   # check units
## [1] "m"
st_crs(4326)
## Coordinate Reference System:
##   EPSG: 4326 
##   proj4string: "+proj=longlat +datum=WGS84 +no_defs"
st_crs(4326)$units
## NULL
st_crs(NA)                          # unknown (assumed planar/Cartesian)
## Coordinate Reference System: NA

Change the units!

units::set_units(a1, km^2)
## 824886.6 [km^2]
units::set_units(a1, ft^2)
## 8.879005e+12 [ft^2]

From csv to an sf object

zebra_data <- read.csv(here("data", "zebra.csv"))
head(zebra_data)
##   X.1   X  Name            Date Longitude  Latitude Speed Direction Temp
## 1  50  78 AG059 4/20/2009 15:40  15.89605 -19.12138     0         0    0
## 2  34  60 AG059 4/20/2009 13:01  15.91854 -19.17761     0         0    0
## 3  33  59 AG059 4/20/2009 12:02  15.91857 -19.17766     0         0    0
## 4  62  90 AG059 4/20/2009 17:00  15.80543 -19.08002     0         0    0
## 5 109 159 AG059  4/21/2009 1:40  15.91887 -19.17758     0       270    0
## 6 107 155 AG059  4/21/2009 1:00  15.91882 -19.17760     0       270    0
##   Altitude PDOP    ID         DatePOS diff       day
## 1     1123    2 AG059 4/20/2009 15:40   19 4/20/2009
## 2     1113    3 AG059 4/20/2009 13:01   59 4/20/2009
## 3     1130    3 AG059 4/20/2009 12:02  721 4/20/2009
## 4     1139    2 AG059 4/20/2009 17:00   17 4/20/2009
## 5     1133    2 AG059  4/21/2009 1:40   40 4/21/2009
## 6     1126    4 AG059  4/21/2009 1:00   20 4/21/2009

Converting the object

zebra_sf <- read.csv(here("data", "Zebra.csv")) %>% 
  dplyr::select(ID = Name, 4:6) %>% 
  mutate(timestamp = as.POSIXct(lubridate::mdy_hm(Date))) %>%
  st_as_sf(., coords = 3:4, crs = "+init=epsg:4326") 

st_transform(zebra_sf, 32733) #convert to UTM
## Simple feature collection with 11490 features and 3 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: 546746.9 ymin: 7852220 xmax: 691330.5 ymax: 7912185
## epsg (SRID):    32733
## proj4string:    +proj=utm +zone=33 +south +datum=WGS84 +units=m +no_defs
## First 10 features:
##       ID            Date           timestamp                 geometry
## 1  AG059 4/20/2009 15:40 2009-04-20 15:40:00 POINT (594243.6 7885501)
## 2  AG059 4/20/2009 13:01 2009-04-20 13:01:00   POINT (596576 7879266)
## 3  AG059 4/20/2009 12:02 2009-04-20 12:02:00 POINT (596579.5 7879260)
## 4  AG059 4/20/2009 17:00 2009-04-20 17:00:00   POINT (584733 7890124)
## 5  AG059  4/21/2009 1:40 2009-04-21 01:40:00 POINT (596611.3 7879269)
## 6  AG059  4/21/2009 1:00 2009-04-21 01:00:00 POINT (596605.8 7879266)
## 7  AG059  4/21/2009 5:01 2009-04-21 05:01:00 POINT (596604.9 7879271)
## 8  AG059  4/21/2009 4:41 2009-04-21 04:41:00 POINT (596610.9 7879268)
## 9  AG059  4/22/2009 4:00 2009-04-22 04:00:00 POINT (596603.6 7879269)
## 10 AG059  4/22/2009 2:01 2009-04-22 02:01:00 POINT (596609.4 7879272)

Extra thing that blew my mind! Interactive Maps in R?

countries <- ne_countries(scale=110)
p <- ggplot() +  geom_polygon(data=countries, aes(x=long, y=lat, group=group),  color="white", lwd = .25)
 
ggplotly(p)

Writing data

  • shapefiles
  • database connections
  • use st_write

Exercises

  1. Load the roads shapefile (enproads.shp) and upload Zebra.csv and plot the geometry.
  2. If we look at our roads data, we see that there are several types of roads in the shapefile. How close or far from different road types do zebra move?

Exercise 1

roads <- st_read(here("data", "enproads.shp"), crs = "+init=epsg:4326") %>% #4326= coordinate system based on Earth's center of mass
  st_transform(32733) #32733 = spatial reference for Nambia 
## Reading layer `enproads' from data source `C:\Users\User\Desktop\QERM597-making-maps\data\enproads.shp' using driver `ESRI Shapefile'
## Simple feature collection with 565 features and 3 fields
## geometry type:  LINESTRING
## dimension:      XY
## bbox:           xmin: 14.35337 ymin: -19.48418 xmax: 17.15124 ymax: -18.5008
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs

st_geometry(roads) %>% plot

Exercise 1

zebra_sf <- read.csv(here("data", "Zebra.csv")) %>% 
  dplyr::select(ID = Name, 4:6) %>% 
  mutate(timestamp = as.POSIXct(lubridate::mdy_hm(Date))) %>%
  st_as_sf(., coords = 3:4, crs = "+init=epsg:4326") %>% #longlat, converting foreign object to SF object
  st_transform(32733) #convert to UTM

ggplot() +
  geom_sf(data=roads) +
  geom_sf(data=zebra_sf, aes(color=ID))

Exercise 2

How close or far from different road types do zebra move?

unique(roads$TYPE)
## [1] Track  Graded Gravel Tar   
## Levels: Graded Gravel Tar Track

#Filter roads based off type:
large_roads <- filter(roads, TYPE %in% c("Tar", "Gravel"))
small_roads <- filter(roads, TYPE %in% c("Graded", "Track"))

ggplot() +
  geom_sf(data=large_roads, size=1.5) + 
  geom_sf(data=small_roads, size=0.6) + 
  geom_sf(data=zebra_sf, aes(color=ID))

Exercise 2:

# Find the minimum distance (in meters) of each point to a large road
large_dist<- st_distance(y=zebra_sf, x=large_roads) # a units matrix dim =[nrow(x), nrow(y)]; takes 10-20 seconds
zebra_sf$large_road_dist <- apply(large_dist, 2, min)

# Find the minimum distance (in meters) of each point to a small road
small_dist<- st_distance(y=zebra_sf, x=small_roads) # a units matrix dim =[nrow(x), nrow(y)]; takes 10-20 seconds
zebra_sf$small_road_dist <- apply(small_dist, 2, min)

head(data.frame(zebra_sf))
##      ID            Date           timestamp                 geometry
## 1 AG059 4/20/2009 15:40 2009-04-20 15:40:00 POINT (594243.6 7885501)
## 2 AG059 4/20/2009 13:01 2009-04-20 13:01:00   POINT (596576 7879266)
## 3 AG059 4/20/2009 12:02 2009-04-20 12:02:00 POINT (596579.5 7879260)
## 4 AG059 4/20/2009 17:00 2009-04-20 17:00:00   POINT (584733 7890124)
## 5 AG059  4/21/2009 1:40 2009-04-21 01:40:00 POINT (596611.3 7879269)
## 6 AG059  4/21/2009 1:00 2009-04-21 01:00:00 POINT (596605.8 7879266)
##   large_road_dist small_road_dist
## 1        7.845860       3479.8987
## 2      163.187201        241.8660
## 3      156.746071        245.8446
## 4        9.115608       1605.2154
## 5      151.769751        227.9759
## 6      151.358102        231.6984

Exercise 2:

ggplot(zebra_sf) +
  geom_histogram(aes(large_road_dist, fill="large roads"), alpha=0.5) +
  geom_histogram(aes(small_road_dist, fill="small roads"), alpha=0.5) +
  scale_fill_manual(labels=c("large roads", "small roads"), values=c("blue", "orange"))

PART II: PLOTTING RASTER DATA AND EXTRACTING INFORMATION

DATA: 1. taxon point locations, 2. relevant shapefile (Ecosystem type?, country?)

Reading raster data

## class      : RasterLayer 
## dimensions : 474, 1274, 603876  (nrow, ncol, ncell)
## resolution : 231.6564, 231.6564  (x, y)
## extent     : 431873.8, 727004, 7844489, 7954294  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=33 +south +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0 
## source     : C:/Users/User/Desktop/QERM597-making-maps/data/ndvi_mean_utm.tif 
## names      : ndvi_mean_utm 
## values     : -558.117, 5303.021  (min, max)
## class      : RasterLayer 
## dimensions : 474, 1274, 603876  (nrow, ncol, ncell)
## resolution : 231.6564, 231.6564  (x, y)
## extent     : 431873.8, 727004, 7844489, 7954294  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=33 +south +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0 
## source     : C:/Users/User/Desktop/QERM597-making-maps/data/ndvi_mean_utm.tif 
## names      : ndvi_mean_utm 
## values     : -558.117, 5303.021  (min, max)
## class      : RasterLayer 
## dimensions : 474, 1274, 603876  (nrow, ncol, ncell)
## resolution : 231.6564, 231.6564  (x, y)
## extent     : 431873.8, 727004, 7844489, 7954294  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=33 +south +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0 
## source     : C:/Users/User/Desktop/QERM597-making-maps/data/ndvi_mean_utm.tif 
## names      : ndvi_mean_utm 
## values     : -558.117, 5303.021  (min, max)

Converting projection

Writing raster data

Extracting data

EXERCISES

  1. Load the NDVI raster file and plot. Overlay Zebra location
  2. Extract NDVI at zebra locations
  3. Plot zebra location data. Color points according to NDVI values

Exercise 1

## class      : RasterLayer 
## dimensions : 474, 1274, 603876  (nrow, ncol, ncell)
## resolution : 231.6564, 231.6564  (x, y)
## extent     : 431873.8, 727004, 7844489, 7954294  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=33 +south +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0 
## source     : memory
## names      : ndvi_mean_utm 
## values     : -0.0558117, 0.5303021  (min, max)

Exercise 2

##      ID            Date           timestamp                 geometry
## 1 AG059 4/20/2009 15:40 2009-04-20 15:40:00 POINT (594243.6 7885501)
## 2 AG059 4/20/2009 13:01 2009-04-20 13:01:00   POINT (596576 7879266)
## 3 AG059 4/20/2009 12:02 2009-04-20 12:02:00 POINT (596579.5 7879260)
## 4 AG059 4/20/2009 17:00 2009-04-20 17:00:00   POINT (584733 7890124)
## 5 AG059  4/21/2009 1:40 2009-04-21 01:40:00 POINT (596611.3 7879269)
## 6 AG059  4/21/2009 1:00 2009-04-21 01:00:00 POINT (596605.8 7879266)
##   large_road_dist small_road_dist NDVI_mean
## 1        7.845860       3479.8987 0.2409661
## 2      163.187201        241.8660 0.2682169
## 3      156.746071        245.8446 0.2682169
## 4        9.115608       1605.2154 0.2380762
## 5      151.769751        227.9759 0.2419201
## 6      151.358102        231.6984 0.2419201

#####Acknowledge existance of multidimenional data and packages. netCDF files

PART III: ADDING BASE MAPS

Adding basemaps!

  • Focus on ggmap
  • As of mid-2018, requires registering with Google and obtaining an API key
  • Requires providing a valid credit card (yikes!), though charges are nonexistent or at least very minimal
  • See the ggmap GitHub page for more information about API keys

Basemap options from ggmap

  • Basemaps available from Google, Stamen, or Open Street Map
  • Terrain, satellite, or watercolor
  • See this helpful cheat sheet

Making maps using ggmap

Two steps:

  • Download basemap raster
  • Plot raster and overlay other spatial data

Register API key at start of session

This is my personal API key, which you can use for the purposes of this class!

register_google(key = "AIzaSyBRkw_wzDKuWZDikdD86Kmp1Sa6PzuCKFc")

Stamen basemaps

To add a Stamen basemap, first define the bounding box for the basemap you would like to download.

zebra_bbox <- c(14, -20, 17.5, -18)
names(zebra_bbox) <- c("left", "bottom", "right", "top")

Stamen basemaps: terrain

terr_basemap <- get_stamenmap(bbox=zebra_bbox, zoom=9, maptype="terrain")

ggmap(terr_basemap) +
  geom_sf(data=roads, inherit.aes = FALSE) + 
  geom_sf(data=zebra_sf, aes(color=ID), inherit.aes = FALSE) +
  coord_sf(crs = st_crs(4326))

Stamen basemaps: watercolor

wat_basemap <- get_stamenmap(bbox=zebra_bbox, zoom=9, maptype="watercolor")

ggmap(wat_basemap) +
  geom_sf(data=roads, inherit.aes = FALSE) + 
  geom_sf(data=zebra_sf, aes(color=ID), inherit.aes = FALSE) +
  coord_sf(crs = st_crs(4326))

Google basemaps

To add a Google basemap, first define the center coordinates for the basemap you would like to download.

zebra_center <- c(15.8, -19)
names(zebra_center) <- c("lon", "lat")

Google basemaps: satellite

sat_basemap <- get_googlemap(center=zebra_center, zoom=8, size=c(640, 640), scale=2,
                             maptype="satellite")
ggmap(sat_basemap) +
  geom_sf(data=roads, inherit.aes = FALSE) + 
  geom_sf(data=zebra_sf, aes(color=ID), inherit.aes = FALSE) +
  coord_sf(crs = st_crs(4326))

Google basemaps: satellite

Notice that get_googlemap() returns a square basemap tile, which then sets the plotting limits of your map.

To manually set different plotting limits, pass in an empty “base layer” and set x and y limits using ggplot().

Google basemaps: satellite

ggmap(sat_basemap, maprange=FALSE, base_layer=ggplot(aes(x=1, y=1), data=NULL)) +
  xlim(14.3, 17.4) + ylim(-20, -18) + xlab("lon") + ylab("lat") +
  geom_sf(data=roads, inherit.aes = FALSE) + 
  geom_sf(data=zebra_sf, aes(color=ID), inherit.aes = FALSE) +
  coord_sf(crs = st_crs(4326))

Adding basemaps: alternative approaches

  • One alternative package is RgoogleMaps
    • No need to get Google API key (yay!)
    • Does not seems to be compatible with ggplot2 (boo)
  • Another approach is to download raster data directly from Natural Earth
    • Have to download manually from website and import tif file as a raster brick, then display red/green/blue values
    • Only works well if you are mapping a large spatial extent (lacks fine resolution)

PART IV: Making it pretty for publication

Things to tweak

  • Fix the legend
  • Add a scale bar and north arrow
  • Annotations
  • Inset maps

Natural earth package to access other data

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\User\AppData\Local\Temp\RtmpYZTJFu", layer: "ne_10m_rivers_lake_centerlines"
## with 1455 features
## It has 34 fields
## Integer64 fields read as strings:  rivernum ne_id
## Warning in rgdal::readOGR(destdir, file_name, encoding = "UTF-8",
## stringsAsFactors = FALSE, : Dropping null geometries: 1453
## although coordinates are longitude/latitude, st_intersection assumes that they are planar
## Warning: attribute variables are assumed to be spatially constant throughout all
## geometries

Fix the legend

Add scale bar and north arrow

Add annotation

Thank you!

Additional Resources